{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Data Preparation\n",
    "\n",
    "Photon and spacecraft data are all that a user needs for the analysis. For the definition of LAT data products, see the information in the [Cicerone](https://fermi.gsfc.nasa.gov/ssc/data/analysis/documentation/Cicerone/Cicerone_Data/LAT_DP.html).\n",
    "\n",
    "The LAT data can be extracted from the Fermi Science Support Center web site as described in section [Extract LAT data](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/extract_latdata.html). Preparing these data for analysis depends on the type of analysis you wish to perform (e.g. point source, extended source, GRB spectral analysis, timing analysis, etc). The different cuts to the data are described in detail in the [Cicerone](https://fermi.gsfc.nasa.gov/ssc/data/analysis/documentation/Cicerone/Cicerone_Data/LAT_DP.html)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Data preparation consists of two steps:\n",
    "* ([gtselect](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtselect.txt)): Used to make cuts based on columns in the event data file such as time, energy, position, zenith angle, instrument coordinates, event class, and event type (new in Pass 8).\n",
    "* ([gtmktime](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtmktime.txt)): In addition to cutting the selected events, gtmktime makes cuts based on the spacecraft file and updates the GTI extension.\n",
    "\n",
    "\n",
    "Here we give an example of how to prepare the data for the analysis of a point source. For your particular source analysis you have to prepare your data performing similar steps, but with the cuts suggested in Cicerone for your case."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Event Selection with gtselect\n",
    "\n",
    "In this section, we look at making basic data cuts using [gtselect](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtselect.txt). By default, gtselect prompts for cuts on:\n",
    "* Time\n",
    "* Energy\n",
    "* Position (RA,Dec,radius)\n",
    "* Maximum Zenith Angle\n",
    "\n",
    "However, by using hidden parameters defined on the command line (or using the '_Show Advanced Parameters_' check box in GUI mode), you can also make cuts on:\n",
    "\n",
    "* Event class ID\n",
    "* Event type ID (selects on conversion type, angular or energy reconstruction quality)\n",
    "* Minimum pulse phase\n",
    "* Maximum pulse phase"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For this example, we use data that was extracted using the procedure described in the [Extract LAT Data](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/extract_latdata.html) tutorial. The original selection used the following information:\n",
    "\n",
    "* Search Center (RA,DEC) = (193.98,-5.82)\n",
    "* Radius = 20 degrees\n",
    "* Start Time (MET) = 239557417 seconds (2008-08-04 T15:43:37)\n",
    "* Stop Time (MET) = 255398400 seconds (2009-02-04 T00:00:00)\n",
    "* Minimum Energy = 100 MeV\n",
    "* Maximum Energy = 500000 MeV\n",
    "\n",
    "The LAT operated in survey mode for that period of time. We provide the user with the photon and spacecraft data files extracted in the same method as described in the [Extract LAT data](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/extract_latdata.html) tutorial:\n",
    "\n",
    "1. L1506091032539665347F73_PH00.fits\n",
    "2. L1506091032539665347F73_PH01.fits\n",
    "3. L1506091032539665347F73_SC00.fits"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!wget https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/data/dataPreparation/L1506091032539665347F73_PH00.fits\n",
    "!wget https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/data/dataPreparation/L1506091032539665347F73_PH01.fits\n",
    "!wget https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/data/dataPreparation/L1506091032539665347F73_SC00.fits"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!mkdir data\n",
    "!mv *.fits ./data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If more than one file was generated by the [LAT Data Server](https://fermi.gsfc.nasa.gov/cgi-bin/ssc/LAT/LATDataQuery.cgi), we will need to provide an input file list in order to use all the event data files in the same analysis. This text file can be generated by typing:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!ls ./data/*_PH* > ./data/events.txt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!cat ./data/events.txt"
   ]
  },
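  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The same list can also be written from Python. This is only a sketch: the helper name `write_event_list` is hypothetical, and it assumes the photon files follow the `*_PH*` naming pattern used for the files above.\n",
    "\n",
    "```python\n",
    "import glob\n",
    "import os\n",
    "\n",
    "NEWLINE = chr(10)  # line feed character\n",
    "\n",
    "def write_event_list(data_dir, list_name='events.txt', pattern='*_PH*'):\n",
    "    '''Write the sorted paths of matching photon files to a list file.'''\n",
    "    paths = sorted(glob.glob(os.path.join(data_dir, pattern)))\n",
    "    with open(os.path.join(data_dir, list_name), 'w') as handle:\n",
    "        for path in paths:\n",
    "            handle.write(path + NEWLINE)  # one path per line\n",
    "    return paths\n",
    "```\n",
    "\n",
    "Calling `write_event_list('./data')` should produce a `./data/events.txt` equivalent to the one created with `ls`."
   ]
  },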
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This input file can be used in place of a single input events (or FT1) file by placing an `@` symbol before the text filename. The output from [gtselect](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtselect.txt) will be a single file containing all events from the combined file list that satisfy the other specified cuts.\n",
    "\n",
    "For a simple point source analysis, it is recommended that you only include events with a high probability of being photons. This cut is performed by selecting \"source\" class events with the the [gtselect](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtselect.txt) tool by including the hidden parameter evclass on the command line. For LAT Pass 8 data, `source` events are specified as event class 128 (the default value).\n",
    "\n",
    "Additionally, in Pass 8, you can supply the hidden parameter `evtype` (event type) which is a sub-selection on `evclass`. For a simple analysis, we wish to include all front+back converting events within all PSF and Energy subclasses. This is specified as evtype 3 (the default value).\n",
    "\n",
    "The recommended values for both evclass and evtype may change as LAT data processing develops.\n",
    "\n",
    "Now run [gtselect](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtselect.txt) to select the data you wish to analyze. For this example, we consider the source class photons within a 20 degree acceptance cone of the blazar 3C 279. We apply the **gtselect** tool to the data file as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "gtselect evclass=128 evtype=3\n",
    "    @./data/events.txt\n",
    "    ./data/3C279_region_filtered.fits\n",
    "    193.98\n",
    "    -5.82\n",
    "    20\n",
    "    INDEF\n",
    "    INDEF\n",
    "    100\n",
    "    500000\n",
    "    90\n",
    "\n",
    "#### Parameters:\n",
    "# Input file or files (if multiple files are in a .txt file,\n",
    "#        don't forget the @ symbol)\n",
    "# Output file\n",
    "# RA for new search center\n",
    "# Dec or new search center\n",
    "# Radius of the new search region\n",
    "# Start time (MET in s)\n",
    "# End time (MET in s)\n",
    "# Lower energy limit (MeV)\n",
    "# Upper energy limit (MeV)\n",
    "# Maximum zenith angle value (degrees)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Or, you can choose to run **gtselect** via python. It is named `filter` in the `gt_apps` module."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from gt_apps import filter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "filter.pars()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "filter['infile'] = '@./data/events.txt'\n",
    "filter['outfile'] = './data/3C279_filtered.fits'\n",
    "filter['ra'] = 194.047\n",
    "filter['dec'] = -5.78931\n",
    "filter['rad'] = 15\n",
    "filter['tmin'] = 239557417\n",
    "filter['tmax'] = 255398400\n",
    "filter['emin'] = 100\n",
    "filter['emax'] = 100000\n",
    "filter['zmax'] = 90\n",
    "filter['evclass'] = 128\n",
    "filter['evtype'] = 3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "filter.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The filtered data will be found in the file `./data/3C279_region_filtered.fits`.\n",
    "\n",
    "If you don't want to make a selection on a given parameter, just enter a zero (0) as the value.\n",
    "\n",
    "In this step we also selected the maximum zenith angle value as suggested in the [Cicerone](https://fermi.gsfc.nasa.gov/ssc/data/analysis/documentation/Cicerone/Cicerone_Data_Exploration/Data_preparation.html). Photons coming from the Earth limb are a strong source of background. You can minimize this effect with a zenith angle cut. The value of 90 degrees is suggested for reconstructing events above 100 MeV and provides a sufficient buffer between your region of interest (ROI) and the Earth's limb.\n",
    "\n",
    "In the next step, [gtmktime](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtmktime.txt) will remove any time period that our ROI overlaps this buffer region. While increasing the buffer (reducing zmax) may decrease the background rate from albedo gammas, it will also reduce the amount of time your ROI is completely free of the buffer zone and thus reduce the livetime on the source of interest."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Notes**:\n",
    "\n",
    "* The RA and Dec of the search center must exactly match that used in the dataserver selection. If they are not the same, multiple copies of the source position will appear in your prepared data file which will cause later stages of analysis to fail. See \"DSS Keywords\" below.\n",
    "\n",
    "\n",
    "* The radius of the search region selected here must lie entirely within the region defined in the dataserver selection. They can be the same values, with no negative effects.\n",
    "\n",
    "\n",
    "* The time span selected here must lie within the time span defined in the dataserver selection. They can be the same values with no negative effects.\n",
    "\n",
    "\n",
    "* The energy range selected here must lie within the time span defined in the dataserver selection. They can be the same values with no negative effects."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**BE AWARE**: [gtselect](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtselect.txt) writes descriptions of the data selections to a series of _Data Sub-Space_ (DSS) keywords in the `EVENTS` extension header.\n",
    "\n",
    "These keywords are used by the exposure-related tools and by [gtlike](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtlike.txt) for calculating various quantities, such as the predicted number of detected events given by the source model. These keywords MUST be same for all of the filtered event files considered in a given analysis.\n",
    "\n",
    "[gtlike](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtlike.txt) will check to ensure that all of the DSS keywords are the same in all of the event data files. For a discussion of the DSS keywords see the Data Sub-Space Keywords page."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are multiple ways to view information about your data file. For example:\n",
    "* You may obtain the value of start and end time of your file by using the fkeypar tool. This tool is part of the [FTOOLS](http://heasarc.nasa.gov/lheasoft/ftools/ftools_menu.html) software package and is used to read the value of a FITS header keyword and write it to an output parameter file. For more information on the fkeypar tool, type: "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`fhelp fkeypar`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* The [gtvcut](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtvcut.txt) tool can be used to view the DSS keywords in a given extension, where the EVENTS extension is assumed by default. This is an excellent way to to find out what selections have been made already on your data file (by either the dataserver, or previous runs of gtselect).\n",
    "\n",
    "    * NOTE: If you wish to view the (very long) list of good time intervals (GTIs), you can use the hidden parameter `suppress_gtis=no` on the command line. The full list of GTIs is suppressed by default."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Time Selection with gtmktime\n",
    "\n",
    "You may have noticed that all of these files have a GTI extension in them. Before we look at making selections with the [gtmktime](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtmktime.txt) tool, we should probably clarify what at Good Time Interval (GTI) is:\n",
    "\n",
    "* Simply stated, a GTI is a time range when the data can be considered valid. The GTI extension contains a list of these GTI's for the file. Thus the sum of the entries in the GTI extension of a file corresponds to the time when the data in the file is \"good.\"\n",
    "\n",
    "How are these interpreted for Fermi?\n",
    "\n",
    "* The initial list of GTI's are the times that the LAT was collecting data over the time range you selected. The LAT does not collect data while the observatory is transiting the Southern Atlantic Anomaly (SAA), or during rare events such as software updates or spacecraft maneuvers.\n",
    "\n",
    "**Notes**:\n",
    "* Your object will most likely not be in the field of view during the entire time that the LAT was taking data.\n",
    "\n",
    "\n",
    "* Additional data cuts made with gtmktime will update the GTI's based on the cuts specified in both gtmktime and gtselect.\n",
    "\n",
    "\n",
    "* The Fermitools use the GTI's when calculating exposure. If these have not been properly updated, the exposure correction made during science analysis may be incorrect."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[gtmktime](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtmktime.txt) is used to update the GTI extension and make cuts based on spacecraft parameters contained in the spacecraft (pointing and livetime history) file. It reads the spacecraft file and, based on the filter expression and specified cuts, creates a set of GTIs. These are then combined (logical and) with the existing GTIs in the Event data file, and all events outside this new set of GTIs are removed from the file. New GTIs are then written to the GTI extension of the new file.\n",
    "\n",
    "Cuts can be made on any field in the spacecraft file by adding terms to the filter expression using C-style relational syntax:\n",
    "\n",
    "    !->not, &&-> and, -> or, ==, !=, >, <, >=, <=\n",
    "\n",
    "    ABS(), COS(), SIN(), etc., also work\n",
    "\n",
    ">**NOTE**: Every time you specify an additional cut on time, ROI, zenith angle, event class, or event type using [gtselect](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtselect.txt), you must run [gtmktime](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtmktime.txt) to reevaluate the GTI selection.\n",
    "\n",
    "Several of the cuts made above with **gtselect** will directly affect the exposure. Running [gtmktime](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtmktime.txt) will select the correct GTIs to handle these cuts."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is also especially important to apply a zenith cut for small ROIs (< 20 degrees), as this brings your source of interest close to the Earth's limb.There are two different methods for handling the complex cut on zenith angle.\n",
    "\n",
    "One method involves excluding time intervals where the buffer zone defined by the zenith cut intersects the ROI from the list of GTIs. In order to do that, run [gtmktime](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtmktime.txt) and answer \"yes\" at the prompt:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```\n",
    "> gtmktime\n",
    "...\n",
    "> Apply ROI-based zenith angle cut [] yes\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ">**NOTE**: If you are studying a very broad region (or the whole sky) you would lose most (all) of your data when you implement the ROI-based zenith angle cut in [gtmktime](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtmktime.txt).\n",
    ">\n",
    ">In this case you can allow all time intervals where the cut intersects the ROI, but the intersection lies outside the FOV. To do this, run _gtmktime_ specifying a filter expression defining your analysis region, and answer \"no\" to the question regarding the ROI-based zenith angle cut:\n",
    ">\n",
    ">`> Apply ROI-based zenith angle cut [] no`\n",
    ">\n",
    ">Here, RA_of_center_ROI, DEC_of_center_ROI and radius_ROI correspond to the ROI selection made with gtselect, zenith_cut is defined as 90 degrees (as above), and limb_angle_minus_FOV is (zenith angle of horizon - FOV radius) where the zenith angle of the horizon is 113 degrees."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can, instead, apply the zenith cut to the livetime calculation while running [gtltcube](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtltcube.txt). This is the method that is currently recommended by the LAT team (see the [Livetimes and Exposure](https://fermi.gsfc.nasa.gov/ssc/data/analysis/documentation/Cicerone/Cicerone_Likelihood/Exposure.html) section of the [Cicerone](https://fermi.gsfc.nasa.gov/ssc/data/analysis/documentation/Cicerone/)), and is the method we will use most commonly in these analysis threads. To do this, answer \"no\" at the [gtmktime](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtmktime.txt) prompt:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`> Apply ROI-based zenith angle cut [] no`\n",
    "\n",
    "And instead, remember to use the hidden `zmax` parameter later when calculating the livetime cube:\n",
    "\n",
    "`> gtltcube zmax=90`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[gtmktime](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtmktime.txt) also provides the ability to exclude periods when some event has negatively affected the quality of the LAT data. To do this, we select good time intervals (GTIs) by using a logical filter for any of the [quantities in the spacecraft file](https://fermi.gsfc.nasa.gov/ssc/data/analysis/documentation/Cicerone/Cicerone_Data/LAT_Data_Columns.html#SpacecraftFile). Some possible quantities for filtering data are:\n",
    "\n",
    "* `DATA_QUAL` - quality flag set by the LAT instrument team (1 = ok, 2 = waiting review, 3 = good with bad parts, 0 = bad)\n",
    "\n",
    "* `LAT_CONFIG` - instrument configuration (0 = not recommended for analysis, 1 = science configuration)\n",
    "\n",
    "* `ROCK_ANGLE` - can be used to eliminate pointed observations from the dataset.\n",
    "\n",
    ">**NOTE**: A history of the rocking profiles that have been used by the LAT can be found in the [observations](https://fermi.gsfc.nasa.gov/ssc/observations/types/allsky/) section."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The current [gtmktime](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtmktime.txt) filter expression recommended by the LAT team is:\n",
    "\n",
    "**(DATA_QUAL>0)&&(LAT_CONFIG==1).**\n",
    "\n",
    ">**NOTE**: The \"DATA_QUAL\" parameter can be set to different values, based on the type of object and analysis the user is interested into (see this page of the Cicerone for the most updated detailed description of the parameter's values). Typically, setting the parameter to 1 is the best option. For GRB analysis, on the contrary, the parameter should be set to \">0\".\n",
    "\n",
    "Here is an example of running [gtmktime](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/help/gtmktime.txt) on the 3C 279 filtered events file. It is useful to rename the spacecraft file to something easier."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!mv ./data/L1506091032539665347F73_SC00.fits ./data/spacecraft.fits"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here, we have renamed our spacecraft fits file to `spacecraft.fits`.\n",
    "\n",
    "Now, we run **gtmktime**:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "gtmktime \n",
    "    ./data/spacecraft.fits \n",
    "    (DATA_QUAL>0)&&(LAT_CONFIG==1) \n",
    "    no\n",
    "    ./data/3C279_region_filtered.fits \n",
    "    ./data/3C279_region_filtered_gti.fits\n",
    "    \n",
    "#### Parameters:\n",
    "# Spacecraft file\n",
    "# Filter expression\n",
    "# Apply ROI-based zenith angle cut\n",
    "# Event data file\n",
    "# Output event file name"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!ls ./data/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The data with all the cuts described above is provided in this [link](https://fermi.gsfc.nasa.gov/ssc/data/analysis/scitools/data/dataPreparation/3C279_region_filtered_gti.fits); it is the `3C279_region_filtered_gti.fits` file.\n",
    "\n",
    "After the data preparation, it is advisable to take a look at your data before beginning the detailed analysis. The [Explore LAT data](3.ExploreLATData.ipynb) tutorial has suggestions on methods of getting a quick preview of your data."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.14"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
